HOME

TheInfoList



OR:

The ILLIAC IV was the first
massively parallel Massively parallel is the term for using a large number of computer processors (or separate computers) to simultaneously perform a set of coordinated computations in parallel. GPUs are massively parallel architecture with tens of thousands of ...
computer. The system was originally designed to have 256
64-bit In computer architecture, 64-bit integers, memory addresses, or other data units are those that are 64 bits wide. Also, 64-bit central processing units (CPU) and arithmetic logic units (ALU) are those that are based on processor registers, a ...
floating-point unit A floating-point unit (FPU), numeric processing unit (NPU), colloquially math coprocessor, is a part of a computer system specially designed to carry out operations on floating-point numbers. Typical operations are addition, subtraction, multip ...
s (FPUs) and four
central processing unit A central processing unit (CPU), also called a central processor, main processor, or just processor, is the primary Processor (computing), processor in a given computer. Its electronic circuitry executes Instruction (computing), instructions ...
s (CPUs) able to process 1 billion operations per second. Due to budget constraints, only a single "quadrant" with 64 FPUs and a single CPU was built. Since the FPUs all processed the same instruction — ADD,  SUB, etc. — in
Flynn's taxonomy Flynn's taxonomy is a classification of computer architectures, proposed by Michael J. Flynn in 1966 and extended in 1972. The classification system has stuck, and it has been used as a tool in the design of modern processors and their functionalit ...
, the design would be considered to be
single instruction, multiple data Single instruction, multiple data (SIMD) is a type of parallel computer, parallel processing in Flynn's taxonomy. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneousl ...
, or SIMD. The concept of building a computer using an array of processors came to Daniel Slotnick while working as a programmer on the
IAS machine The IAS machine was the first electronic computer built at the Institute for Advanced Study (IAS) in Princeton, New Jersey. It is sometimes called the von Neumann machine, since the paper describing its design was edited by John von Neumann, a ...
in 1952. A formal design did not start until 1960, when Slotnick was working at
Westinghouse Electric Corporation The Westinghouse Electric Corporation was an American manufacturing company founded in 1886 by George Westinghouse and headquartered in Pittsburgh, Pennsylvania. It was originally named "Westinghouse Electric & Manufacturing Company" and was ...
and arranged development funding under a
United States Air Force The United States Air Force (USAF) is the Air force, air service branch of the United States Department of Defense. It is one of the six United States Armed Forces and one of the eight uniformed services of the United States. Tracing its ori ...
contract. When that funding ended in 1964, Slotnick moved to the
University of Illinois Urbana-Champaign The University of Illinois Urbana-Champaign (UIUC, U of I, Illinois, or University of Illinois) is a public university, public land-grant university, land-grant research university in the Champaign–Urbana metropolitan area, Illinois, United ...
and joined the Illinois Automatic Computer (ILLIAC) team. With funding from the
Advanced Research Projects Agency The Defense Advanced Research Projects Agency (DARPA) is a research and development agency of the United States Department of Defense responsible for the development of emerging technologies for use by the military. Originally known as the Adva ...
(ARPA), they began the design of a newer concept with 256 64-bit processors instead of the original concept with 1,024 1-bit processors. While the machine was being assembled by Burroughs, the university began building a new facility to house it. Political tension over the funding from the
United States Department of Defense The United States Department of Defense (DoD, USDOD, or DOD) is an United States federal executive departments, executive department of the federal government of the United States, U.S. federal government charged with coordinating and superv ...
led to the ARPA and the university fearing for the machine's safety. When the first 64-processor quadrant of the machine was completed in 1972, it was sent to the NASA
Ames Research Center The Ames Research Center (ARC), also known as NASA Ames, is a major NASA research center at Moffett Federal Airfield in California's Silicon Valley. It was founded in 1939 as the second National Advisory Committee for Aeronautics (NACA) laborat ...
in Mountain View, California. After three years of extensive modification to fix various flaws, ILLIAC IV was connected to the
ARPANET The Advanced Research Projects Agency Network (ARPANET) was the first wide-area packet-switched network with distributed control and one of the first computer networks to implement the TCP/IP protocol suite. Both technologies became the tec ...
for distributed use in November 1975, becoming the first network-available
supercomputer A supercomputer is a type of computer with a high level of performance as compared to a general-purpose computer. The performance of a supercomputer is commonly measured in floating-point operations per second (FLOPS) instead of million instruc ...
, beating the
Cray-1 The Cray-1 was a supercomputer designed, manufactured and marketed by Cray Research. Announced in 1975, the first Cray-1 system was installed at Los Alamos National Laboratory in 1976. Eventually, eighty Cray-1s were sold, making it one of the ...
by nearly 12 months. Running at half its design speed, the one-quadrant ILLIAC IV delivered 50  MFLOP peak, making it the fastest computer in the world at that time. It is also credited with being the first large computer to use solid-state memory, as well as the most complex computer built to that date, with over 1 million
logic gate A logic gate is a device that performs a Boolean function, a logical operation performed on one or more binary inputs that produces a single binary output. Depending on the context, the term may refer to an ideal logic gate, one that has, for ...
s. Generally considered a failure due to massive budget overruns, the design was instrumental in the development of new techniques and systems for programming parallel systems. In the 1980s, several machines based on ILLIAC IV concepts were successfully delivered.


History


Origins

In June 1952, Daniel Slotnick began working on the
IAS machine The IAS machine was the first electronic computer built at the Institute for Advanced Study (IAS) in Princeton, New Jersey. It is sometimes called the von Neumann machine, since the paper describing its design was edited by John von Neumann, a ...
at the
Institute for Advanced Study The Institute for Advanced Study (IAS) is an independent center for theoretical research and intellectual inquiry located in Princeton, New Jersey. It has served as the academic home of internationally preeminent scholars, including Albert Ein ...
(IAS) at
Princeton University Princeton University is a private university, private Ivy League research university in Princeton, New Jersey, United States. Founded in 1746 in Elizabeth, New Jersey, Elizabeth as the College of New Jersey, Princeton is the List of Colonial ...
. The IAS machine featured a bit-parallel math unit that operated on 40-bit
words A word is a basic element of language that carries meaning, can be used on its own, and is uninterruptible. Despite the fact that language speakers often have an intuitive grasp of what a word is, there is no consensus among linguists on its ...
. Originally equipped with
Williams tube The Williams tube, or the Williams–Kilburn tube named after inventors Frederic Calland Williams, Freddie Williams and Tom Kilburn, is an early form of computer memory. It was the first Random-access memory, random-access digital storage devi ...
memory, a magnetic
drum memory Drum memory was a magnetic data storage device invented by Gustav Tauschek in 1932 in Austria. Drums were widely used in the 1950s and into the 1960s as computer memory. Many early computers, called drum computers or drum machines, used drum ...
from
Engineering Research Associates Engineering Research Associates, commonly known as ERA, was a pioneering computer firm from the 1950s. ERA became famous for their numerical computers, but as the market expanded they became better known for their drum memory systems. They were ...
was later added. This drum had 80 tracks so two words could be read at a time, and each track stored 1,024 bits. While contemplating the drum's mechanism, Slotnik began to wonder if that was the correct way to build a computer. If the bits of a word were written serially to a single track, instead of in parallel across 40 tracks, then the data could be fed into a bit-serial computer directly from the drum bit-by-bit. The drum would still have multiple tracks and heads, but instead of gathering up a word and sending it to a single ALU, in this concept the data on each track would be read a bit at a time and sent into parallel ALUs. This would be a word-parallel, bit-serial computer. Slotnick raised the idea at the IAS, but
John von Neumann John von Neumann ( ; ; December 28, 1903 – February 8, 1957) was a Hungarian and American mathematician, physicist, computer scientist and engineer. Von Neumann had perhaps the widest coverage of any mathematician of his time, in ...
dismissed it as requiring "too many tubes". Slotnick left the IAS in February 1954 to return to school to pursue his
PhD A Doctor of Philosophy (PhD, DPhil; or ) is a terminal degree that usually denotes the highest level of academic achievement in a given discipline and is awarded following a course of graduate study and original research. The name of the deg ...
degree and the matter was forgotten.


SOLOMON

After completing his PhD and some post-doctoral work, Slotnick ended up at
IBM International Business Machines Corporation (using the trademark IBM), nicknamed Big Blue, is an American Multinational corporation, multinational technology company headquartered in Armonk, New York, and present in over 175 countries. It is ...
. By this time, for scientific computing at least, tubes and drums had been replaced with transistors and
magnetic-core memory In computing, magnetic-core memory is a form of random-access memory. It predominated for roughly 20 years between 1955 and 1975, and is often just called core memory, or, informally, core. Core memory uses toroids (rings) of a hard magneti ...
. The idea of parallel processors working on different streams of data from a drum no longer had the same obvious appeal. Nevertheless, further consideration showed that parallel machines could still offer significant performance in some applications; Slotnick and a colleague, John Cocke (better known as the inventor of
RISC In electronics and computer science, a reduced instruction set computer (RISC) is a computer architecture designed to simplify the individual instructions given to the computer to accomplish tasks. Compared to the instructions given to a comp ...
), wrote a paper on the concept in 1958. After a short time at IBM and then a stint at
Aeronca Aircraft Aeronca, contracted from Aeronautical Corporation of America, located in Middletown, Ohio, is a US manufacturer of engine components and airframe structures for commercial aviation and the defense industry, and a former aircraft manufacturer. ...
, Slotnick ended up at Westinghouse's Air Arm division, which worked on
radar Radar is a system that uses radio waves to determine the distance ('' ranging''), direction ( azimuth and elevation angles), and radial velocity of objects relative to the site. It is a radiodetermination method used to detect and track ...
and similar systems. Under a contract from the Air Force's
Rome Air Development Center Rome Laboratory (Rome Air Development Center until 1991) is a U.S. Air Force research laboratory for " command, control, and communications" research and development and is responsible for planning and executing the USAF science and technology pr ...
(RADC), Slotnik was able to build a team to design a system with 1,024 bit-serial ALUs, known as "Processing Elements" (PE). This design was given the name SOLOMON, after King
Solomon Solomon (), also called Jedidiah, was the fourth monarch of the Kingdom of Israel (united monarchy), Kingdom of Israel and Judah, according to the Hebrew Bible. The successor of his father David, he is described as having been the penultimate ...
, who was both very wise and had 1,000 wives. The PEs would be fed instructions from a single master CPU, the "control unit" or CU. SOLOMON's CU would read instructions from memory, decode them, and then hand them off to the PEs for processing. Each PE had its own memory for holding operands and results, the PE Memory module, or PEM. The CU could access the entire memory via a dedicated
memory bus In computer architecture, a bus (historically also called a data highway or databus) is a communication system that transfers data between components inside a computer or between computers. It encompasses both hardware (e.g., wires, optica ...
, whereas the PEs could only access their own PEM. To allow results from one PE to be used as inputs in another, a separate network connected each PE to its eight closest neighbours. Several test-bed systems were constructed, including a 3-by-3 (9 PE) system and a 10-by-10 model with simplified PEs. During this period, some consideration was given to more complex PE designs, becoming a 24-bit parallel system that would be organized in a 256-by-32 arrangement. A single PE using this design was built in 1963. As the design work continued, the primary sponsor within the United States Department of Defense was killed in an accident and no further funding was forthcoming. Looking to continue development, Slotnik approached
Lawrence Livermore National Laboratory Lawrence Livermore National Laboratory (LLNL) is a Federally funded research and development centers, federally funded research and development center in Livermore, California, United States. Originally established in 1952, the laboratory now i ...
, who at that time had been at the forefront of supercomputer purchases. They were very interested in the design, but convinced him to upgrade the current design's fixed-point math units to true
floating-point arithmetic In computing, floating-point arithmetic (FP) is arithmetic on subsets of real numbers formed by a ''significand'' (a Sign (mathematics), signed sequence of a fixed number of digits in some Radix, base) multiplied by an integer power of that ba ...
, which resulted in the SOLOMON.2 design. Livermore would not fund development, instead, they offered a contract in which they would lease the machine once it was completed. Westinghouse management considered it too risky, and shut down the team. Slotnik left Westinghouse attempting to find
venture capital Venture capital (VC) is a form of private equity financing provided by firms or funds to start-up company, startup, early-stage, and emerging companies, that have been deemed to have high growth potential or that have demonstrated high growth in ...
to continue the project, but failed. Livermore would later select the
CDC STAR-100 The CDC STAR-100 is a vector supercomputer that was designed, manufactured, and marketed by Control Data Corporation (CDC). It was one of the first machines to use a vector processor to improve performance on appropriate scientific applications. I ...
for this role, as CDC was willing to take on the development costs.


ILLIAC IV

When SOLOMON ended, Slotnick joined the Illinois Automatic Computer design (ILLIAC) team at the University of Illinois at Urbana-Champaign. Illinois had been designing and building large computers for the U.S. Department of Defense and ARPA since 1949. In 1964 the university signed a contract with ARPA to fund the effort, which became known as ILLIAC IV, since it was the fourth computer designed and created at the university. Development started in 1965, and a first-pass design was completed in 1966. In contrast to the bit-serial concept of SOLOMON, in ILLIAC IV the PEs were upgraded to be full 64-bit (bit-parallel) processors, using 12,000 gates and 2048-words of thin-film memory. The PEs had five 64-bit registers, each with a special purpose. One of these, RGR, was used for communicating data to neighbouring PEs, moving one "hop" per clock cycle. Another register, RGD, indicated whether or not that PE was currently active. "Inactive" PEs could not access memory, but they would pass results to neighbouring PEs using the RGR. The PEs were designed to work as a single 64-bit FPU, two 32-bit half-precision FPUs, or eight 8-bit fixed-point processors. Instead of 1,024 PEs and a single CU, the new design had a total of 256 PEs arranged into four 64-PE "quadrants", each with its own CU. The CU's were also 64-bit designs, with sixty-four 64-bit registers and another four 64-bit accumulators. The system could run as four separate 64-PE machines, two 128-PE machines, or a single 256-PE machine. This allowed the system to work on different problems when the data was too small to demand the entire 256-PE array. Based on a 25 MHz clock, with all 256-PEs running on a single program, the machine was designed to deliver 1 billion floating point operations per second, or in today's terminology, 1 
GFLOPS Floating point operations per second (FLOPS, flops or flop/s) is a measure of computer performance in computing, useful in fields of scientific computations that require floating-point calculations. For such cases, it is a more accurate measu ...
. This made it much faster than any machine in the world; the contemporary
CDC 7600 The CDC 7600 was designed by Seymour Cray to be the successor to the CDC 6600, extending Control Data Corporation, Control Data's dominance of the supercomputer field into the 1970s. The 7600 ran at 36.4 MHz (27.5 ns clock cycle) and had ...
had a clock cycle of 27.5 nanoseconds, or 36 MIPS, although for a variety of reasons it generally offered performance closer to 10 MIPS. To support the machine, an extension to the Digital Computer Laboratory buildings were constructed. Sample work at the university was primarily aimed at ways to efficiently fill the PEs with data, thus conducting the first "stress test" in computer development. In order to make this as easy as possible, several new
computer language A computer language is a formal language used to communicate with a computer. Types of computer languages include: * Software construction#Construction languages, Construction language – all forms of communication by which a human can Comput ...
s were created; IVTRAN and TRANQUIL were parallelized versions of FORTRAN, and Glypnir was a similar conversion of
ALGOL ALGOL (; short for "Algorithmic Language") is a family of imperative computer programming languages originally developed in 1958. ALGOL heavily influenced many other languages and was the standard method for algorithm description used by the ...
. Generally, these languages provided support for loading arrays of data "across" the PEs to be executed in parallel, and some even supported the unwinding of loops into array operations.


Construction, problems

In early 1966, the university sent out a request for proposals looking for industrial partners interested in building the design. Seventeen responses were received in July, seven responded, and of these three were selected. Several of the responses, including
Control Data Control Data Corporation (CDC) was a mainframe and supercomputer company that in the 1960s was one of the nine major U.S. computer A computer is a machine that can be Computer programming, programmed to automatically Execution (computing), ...
, attempted to interest them in a
vector processor In computing, a vector processor or array processor is a central processing unit (CPU) that implements an instruction set where its instructions are designed to operate efficiently and effectively on large one-dimensional arrays of data called ...
design instead, but as these were already being designed the team was not interested in building another. In August 1966, eight-month contracts were offered to
RCA RCA Corporation was a major American electronics company, which was founded in 1919 as the Radio Corporation of America. It was initially a patent pool, patent trust owned by General Electric (GE), Westinghouse Electric Corporation, Westinghou ...
, Burroughs, and
UNIVAC UNIVAC (Universal Automatic Computer) was a line of electronic digital stored-program computers starting with the products of the Eckert–Mauchly Computer Corporation. Later the name was applied to a division of the Remington Rand company and ...
to bid on the construction of the machine. Burroughs eventually won the contract, having teamed up with
Texas Instruments Texas Instruments Incorporated (TI) is an American multinational semiconductor company headquartered in Dallas, Texas. It is one of the top 10 semiconductor companies worldwide based on sales volume. The company's focus is on developing analog ...
(TI). Both offered new technical advances that made their bid the most interesting. Burroughs was offering to build a new and much faster version of thin-film memory which would improve performance. TI was offering to build 64-pin
emitter-coupled logic In electronics, emitter-coupled logic (ECL) is a high-speed integrated circuit bipolar transistor logic family. ECL uses a bipolar junction transistor (BJT) differential amplifier with single-ended input and limited emitter current to avoid th ...
(ECL)
integrated circuit An integrated circuit (IC), also known as a microchip or simply chip, is a set of electronic circuits, consisting of various electronic components (such as transistors, resistors, and capacitors) and their interconnections. These components a ...
s (ICs) with 20
logic gate A logic gate is a device that performs a Boolean function, a logical operation performed on one or more binary inputs that produces a single binary output. Depending on the context, the term may refer to an ideal logic gate, one that has, for ...
s each. At the time, most ICs used 16-pin packages and had between four and seven gates. Using TI's ICs would make the system much smaller. Burroughs also supplied the specialized
disk drives Data storage is the recording (storing) of information (data) in a storage medium. Handwriting, phonographic recording, magnetic tape, and optical discs are all examples of storage media. Biological molecules such as RNA and DNA are cons ...
, which featured a separate fixed head for every track and could offer speeds up to 500 Mbit/s and stored about 80  MB per 36-inch disk. They would also provide a Burroughs
B6500 The Burroughs Large Systems Group produced a family of large 48-bit computing, 48-bit mainframe computer, mainframes using stack machine instruction sets with dense Syllable (computing), syllables.E.g., 12-bit syllables for B5000, 8-bit syllables f ...
mainframe to act as a front-end controller, loading data from secondary storage and performing other housekeeping tasks. Connected to the B6500 was a third-party laser optical recording medium, a write-once system that stored up to 1 
Tbit The bit is the most basic unit of information in computing and digital communication. The name is a portmanteau of binary digit. The bit represents a logical state with one of two possible values. These values are most commonly represented as ...
on thin metal film coated on a strip of polyester sheet carried by a rotating drum. Construction of the new design began at Burroughs' Great Valley Lab. At the time, it was estimated the machine would be delivered in early 1970. After a year of working on the ICs, TI announced they had failed to be able to build the 64-pin designs. The more complex internal wiring was causing
crosstalk In electronics, crosstalk (XT) is a phenomenon by which a signal transmitted on one circuit or channel of a transmission system creates an undesired effect in another circuit or channel. Crosstalk is usually caused by undesired capacitive, ...
in the circuitry, and they asked for another year to fix the problems. Instead, the ILLIAC team chose to redesign the machine based on available 16-pin ICs. This required the system to run slower, using a 16 MHz clock instead of the original 25 MHz. The change from 64-pin to 16-pin cost the project about two years, and millions of dollars. TI was able to get the 64-pin design working after just over another year, and began offering them on the market before ILLIAC was complete. As a result of this change, the individual
printed circuit board A printed circuit board (PCB), also called printed wiring board (PWB), is a Lamination, laminated sandwich structure of electrical conduction, conductive and Insulator (electricity), insulating layers, each with a pattern of traces, planes ...
s (PCBs) grew about square to about . This doomed Burroughs' efforts to produce a thin-film memory for the machine, because there was now no longer enough space for the memory to fit within the design's cabinets. Attempts to increase the size of the cabinets to make room for the memory caused serious problems with signal propagation. Slotnick surveyed the potential replacements and picked a semiconductor memory from
Fairchild Semiconductor Fairchild Semiconductor International, Inc. was an American semiconductor company based in San Jose, California. It was founded in 1957 as a division of Fairchild Camera and Instrument by the " traitorous eight" who defected from Shockley Semi ...
, a decision that was so opposed by Burroughs that a full review by ARPA followed. In 1969, these problems, combined with the resulting cost overruns from the delays, led to the decision to build only a single 64-PE quadrant, thereby limiting the machine's speed to about 200 MFLOPS. Together, these changes cost the project three years and $6 million. By 1969, the project was spending $1 million a month, and had to be spun out of the original ILLIAC team who were becoming increasingly vocal in their opposition to the project.


Move to Ames

By 1970, the machine was finally being built at a reasonable rate and it was being readied for delivery in about a year. On 6 January 1970, '' The Daily Illini'', the student newspaper, claimed that the computer would be used to design nuclear weapons. In May, the
Kent State shootings The Kent State shootings (also known as the Kent State massacre or May 4 massacre"These would be the first of many probes into what soon became known as the Kent State Massacre. Like the Boston Massacre almost exactly two hundred years before (Ma ...
took place, and anti-war violence erupted across university campuses. Slotnick grew to be opposed to the use of the machine on classified research, and announced that as long as it was on the university grounds that all processing that took place on the machine would be publicly released. He also grew increasingly concerned that the machine would be subject to attack by the more radical student groups. a position that seemed wise after the local students joined the 9 May 1970 nationwide student strike by declaring a "day of Illiaction", and especially the 24 August bombing of the mathematics building at the
University of Wisconsin–Madison The University of Wisconsin–Madison (University of Wisconsin, Wisconsin, UW, UW–Madison, or simply Madison) is a public land-grant research university in Madison, Wisconsin, United States. It was founded in 1848 when Wisconsin achieved st ...
. With the help of
Hans Mark Hans Michael Mark (June 17, 1929 – December 18, 2021) was a German-born American government official who served as Secretary of the Air Force and as a Deputy Administrator of NASA. He was an expert and consultant in aerospace design and natio ...
, the director of the NASA Ames Research Center in what was becoming
Silicon Valley Silicon Valley is a region in Northern California that is a global center for high technology and innovation. Located in the southern part of the San Francisco Bay Area, it corresponds roughly to the geographical area of the Santa Clara Valley ...
, in January 1971 the decision was made to deliver the machine to Ames rather than the university. Located on an active
United States Navy The United States Navy (USN) is the naval warfare, maritime military branch, service branch of the United States Department of Defense. It is the world's most powerful navy with the largest Displacement (ship), displacement, at 4.5 millio ...
base and protected by the U.S. Marines, security would no longer be a concern. The machine was finally delivered to Ames in April 1972, and installed in the Central Computer Facility in building N-233. By this point it was several years late and well over budget at a total price of $31 million, almost four times the original estimate of $8 million for the complete 256-PE machine. NASA also decided to replace the B6500 front-end machine with a Digital Equipment Corporation
PDP-10 Digital Equipment Corporation (DEC)'s PDP-10, later marketed as the DECsystem-10, is a mainframe computer family manufactured beginning in 1966 and discontinued in 1983. 1970s models and beyond were marketed under the DECsystem-10 name, especi ...
, which were in common use at Ames and would make it much easier to connect to the ARPAnet. This required the development of new software, especially compilers, on the PDP-10. This caused further delays in bringing the machine online. The Illiac IV was contracted to be managed by ACTS Computing Corporation, headquartered in Southfield, Michigan, a Timesharing and Remote Job Entry (RJE) company that had recently been acquired by the conglomerate, Lear Siegler Corporation. The DoD contracted with ACTS under a cost-plus-10% contract. This unusual arrangement was due to the constraint that no government employee could be paid more than a Congress person and many Illiac IV personnel made more than that limit. Dr. Mel Pirtle, with a background from the University of California, Berkeley and the Berkeley Computer Corporation (BCC) was engaged as the Illiac IV's director.


Making it work

When the machine first arrived, it could not be made to work. It suffered from all sorts of problems from cracking PCBs, to bad
resistor A resistor is a passive two-terminal electronic component that implements electrical resistance as a circuit element. In electronic circuits, resistors are used to reduce current flow, adjust signal levels, to divide voltages, bias active e ...
s, to the packaging of the TI ICs being highly sensitive to humidity. These issues were slowly addressed, and by the summer of 1973 the first programs were able to be run on the system although the results were highly questionable. Starting in June 1975, a concerted four-month effort began that required, among other changes, replacing 110,000 resistors, rewiring parts to fix propagation delay issues, improving filtering in the power supplies, and a further reduction in clock speed to 13 MHz. At the end of this process, the system was finally working properly. From then on, the system ran Monday morning to Friday afternoon, providing 60 hours of up-time for the users, but requiring 44 hours of scheduled downtime. Nevertheless, it was increasingly used as NASA programmers learned ways to get performance out of the complex system. At first, performance was dismal, with most programs running at about 15 MFLOPS, about three times the average for the
CDC 7600 The CDC 7600 was designed by Seymour Cray to be the successor to the CDC 6600, extending Control Data Corporation, Control Data's dominance of the supercomputer field into the 1970s. The 7600 ran at 36.4 MHz (27.5 ns clock cycle) and had ...
. Over time this improved, notably after Ames programmers wrote their own version of FORTRAN, CFD, and learned how to parallel I/O into the limited PEMs. On problems that could be parallelized the machine was still the fastest in the world, outperforming the CDC 7600 by two to six times, and it is generally credited as the fastest machine in the world until 1981. On 7 September 1981, after nearly 10 years of operation, the ILLIAC IV was turned off.'This Day in History: September 7', Computer History Museum The machine was officially decommissioned in 1982, and NASA's advanced computing division ended with it. One control unit and one processing element chassis from the machine is now on display at the
Computer History Museum The Computer History Museum (CHM) is a computer museum in Mountain View, California. The museum presents stories and artifacts of Silicon Valley and the Information Age, and explores the Digital Revolution, computing revolution and its impact ...
in Mountain View, less than a mile from its operational site.


Aftermath

ILLIAC was very late, very expensive, and never met its goal of producing 1 GFLOP. It was widely considered a failure even by those who worked on it; one stated simply that "any impartial observer has to regard ILLIAC IV as a failure in a technical sense." In terms of project management it is widely regarded as a failure, running over its cost estimates by four times and requiring years of remedial efforts to make it work. As Slotnik later put it: However, later analyses note that the project had several long-lasting effects on the computer market as a whole, both intentionally and unintentionally. Among the indirect effects was the rapid improvement of semiconductor memory after the ILLIAC project. Slotnick received a lot of criticism when he chose
Fairchild Semiconductor Fairchild Semiconductor International, Inc. was an American semiconductor company based in San Jose, California. It was founded in 1957 as a division of Fairchild Camera and Instrument by the " traitorous eight" who defected from Shockley Semi ...
to produce the memory ICs, as at the time the production line was an empty room and the design existed only on paper. However, after three months of intense effort, Fairchild had a working design being produced ''en masse''. As Slotnick would later comment, "Fairchild did a magnificent job of pulling our chestnuts out of the fire. The Fairchild memories were superb and their reliability to this day is just incredibly good." ILLIAC is considered to have dealt a death blow to
magnetic-core memory In computing, magnetic-core memory is a form of random-access memory. It predominated for roughly 20 years between 1955 and 1975, and is often just called core memory, or, informally, core. Core memory uses toroids (rings) of a hard magneti ...
and related systems like thin-film. Another indirect effect was caused by the complexity of the PCBs, or modules. At the original 25 MHz design speed, impedance in the ground wiring proved to be a serious problem, demanding that the PCBs be as small as possible. As their complexity grew, the PCBs had to add more and more layers in order to avoid growing larger. Eventually, they reached 15 layers deep, which proved to be well beyond the capabilities of draftsmen. The design was ultimately completed using new automated design tools provided by a subcontractor, and the complete design required two years of computer time on a Burroughs mainframe. This was a major step forward in
computer-aided design Computer-aided design (CAD) is the use of computers (or ) to aid in the creation, modification, analysis, or optimization of a design. This software is used to increase the productivity of the designer, improve the quality of design, improve c ...
, and by the mid-1970s such tools were commonplace. ILLIAC also led to major research into the topic of parallel processing that had wide-ranging effects. During the 1980s, with the price of microprocessors falling according to Moore's Law, a number of companies created
MIMD In computing, multiple instruction, multiple data (MIMD) is a technique employed to achieve parallelism. Machines using MIMD have a number of processor cores that function asynchronously and independently. At any time, different processors may b ...
(Multiple Instruction, Multiple Data) to build even more parallel machines, with compilers that could make better use of the parallelism. The
Thinking Machines Thinking Machines Corporation was a supercomputer manufacturer and artificial intelligence (AI) company, founded in Waltham, Massachusetts, in 1983 by Sheryl Handler and W. Daniel "Danny" Hillis to turn Hillis's doctoral work at the Massachuse ...
CM-5 is an excellent example of the MIMD concept. It was the better understanding of parallelism on ILLIAC that led to the improved compilers and programs that could take advantage of these designs. As one ILLIAC programmer put it, "If anybody builds a fast computer out of a lot of microprocessors, Illiac IV will have done its bit in the broad scheme of things." Most supercomputers of the era took another approach to higher performance, using a single very-high-speed
vector processor In computing, a vector processor or array processor is a central processing unit (CPU) that implements an instruction set where its instructions are designed to operate efficiently and effectively on large one-dimensional arrays of data called ...
. Similar to the ILLIAC in some ways, these processor designs loaded up many data elements into a single custom processor instead of a large number of specialized ones. The classic example of this design is the
Cray-1 The Cray-1 was a supercomputer designed, manufactured and marketed by Cray Research. Announced in 1975, the first Cray-1 system was installed at Los Alamos National Laboratory in 1976. Eventually, eighty Cray-1s were sold, making it one of the ...
, which had performance similar to the ILLIAC. There was more than a little "backlash" against the ILLIAC design as a result, and for some time the supercomputer market looked on massively parallel designs with disdain, even when they were successful. As
Seymour Cray Seymour Roger Cray (September 28, 1925 – October 5, 1996)
was an American


Description


Physical arrangement

Each quadrant of the machine was high, deep and long. Arranged beside the quadrant was its
input/output In computing, input/output (I/O, i/o, or informally io or IO) is the communication between an information processing system, such as a computer, and the outside world, such as another computer system, peripherals, or a human operator. Inputs a ...
(I/O) system, whose disk system stored 2.5 
GiB Gib or GIB may refer to: Places * Gibraltar, British overseas territory ** Gibraltar International Airport, IATA code GIB ** Gibraltar national football team, FIFA code GIB * Mount Gibraltar, known as The Gib, New South Wales, Australia * Gib ...
and could read and write data at 1 billion 
bits per second In telecommunications and computing, bit rate (bitrate or as a variable ''R'') is the number of bits that are conveyed or processed per unit of time. The bit rate is expressed in the unit bit per second (symbol: bit/s), often in conjunction ...
, along with the B6700 computer that connected to the machine through the same 1,024-bit-wide interface as the disk system. The machine consisted of a series of carrier chassis holding a number of the small modules. The majority of these were the Processing Units (PUs), which contained the modules for a single PE, its PEM, and the Memory Logic Unit that handled address translation and I/O. The PUs were identical, so they could be replaced or reordered as required.


Processor details

Each CU had about 30 to 40,000 gates. The CU had sixteen 64-bit registers and a separate sixty-four slot 64-bit "scratchpad", LDB. There were four accumulators, AC0 through AC3, a program counter ILR, and various control registers. The system had a short
instruction pipeline In computer engineering, instruction pipelining is a technique for implementing instruction-level parallelism within a single processor. Pipelining attempts to keep every part of the processor busy with some instruction by dividing incoming Mac ...
and implemented instruction look-ahead. The PEs had about 12,000 gates. It included four 64-bit registers, using an accumulator A, an operand buffer B and a secondary scratchpad S. The fourth, R, was used to broadcast or receive data from the other PEs. The PEs used a
carry-lookahead adder A carry-lookahead adder (CLA) or fast adder is a type of Adder (electronics), electronics adder used in digital logic. A carry-lookahead adder improves speed by reducing the amount of time required to determine carry bits. It can be contrasted w ...
, a leading-one detector for Boolean operations, and a
barrel shifter A barrel shifter is a digital circuit that can bit shift, shift a word (data type), data word by a specified number of bits without the use of any sequential logic, only pure combinational logic, i.e. it inherently provides a binary operation. I ...
. 64-bit additions took about 200 ns and multiplications about 400 ns. The PEs were connected to a private memory bank, the PEM, which held 2,048 64-bit words. Access time was on the order of 250 ns The PEs used a
load–store architecture In computer engineering, a load–store architecture (or a register–register architecture) is an instruction set architecture that divides instructions into two categories: memory access ( load and store between memory and registers) and ALU op ...
. The
instruction set architecture In computer science, an instruction set architecture (ISA) is an abstract model that generally defines how software controls the CPU in a computer or a family of computers. A device or program that executes instructions described by that ISA, ...
(ISA) contained two separate sets of instructions, one for the CU (or a unit within it, ADVAST) and another for the PEs. Instructions for the PEs were not decoded, and instead sent directly to the FINST register to be sent to the PEs to process. The ADVAST instructions were decoded and entered the CU's processing pipeline.


Logical arrangement

Each quadrant contained 64 PEs and one CU. The CU had access to the entire I/O bus and could address all of the machine's memory. The PEs could only access their own local store, the PEM, of 2,048 64-bit words. Both the PEs and CU could use load and store operations to access the disk system. The cabinets were so large that it required 240  ns for signals to travel from one end to the other. For this reason, the CU could not be used to coordinate actions, instead, the entire system was clock-synchronous with all operations in the PEs guaranteed to take the same amount of time no matter what the operands were. That way the CU could be sure that the operations were complete without having to wait for results or status codes. To improve the performance of operations that required the output of one PE's results to be used as the input to another PE, the PEs were connected directly to their neighbors, as well as the ones eight-steps away — for instance, PE1 was directly connected to PE0 and PE2, as well as PE9 and PE45. The eight-away connections allowed faster transport when the data needed to travel between more distant PEs. Each shift of data moved 64-words in a single 125 ns clock cycle. The system used a one-address format, in which the instructions included the address of one of the operands and the other operand was in the PE's accumulator (the A register). The address was sent to the PEs over a separate "broadcast" bus. Depending on the instruction, the value on the bus might refer to a memory location in the PE's PEM, a value in one of the PE registers, or a numeric constant. Since each PE had its own memory, while the instruction format and the CUs saw the entire address space, the system included an
index register An index register in a computer's central processing unit, CPU is a processor register (or an assigned memory location) used for pointing to operand addresses during the run of a program. It is useful for stepping through String (computer science ...
(X) to offset the base address. This allowed, for example, the same instruction stream to work on data that was not aligned in the same locations in different PEs. The common example would be an array of data that was loaded into different locations in the PEMs, which could then be made uniform by setting the index in the different PEs.


Branches

In traditional computer designs, instructions are loaded into the CPU one at a time as they are read from memory. Normally, when the CPU completes processing an instruction, the
program counter The program counter (PC), commonly called the instruction pointer (IP) in Intel x86 and Itanium microprocessors, and sometimes called the instruction address register (IAR), the instruction counter, or just part of the instruction sequencer, ...
(PC) is incremented by one word and the next instruction is read. This process is interrupted by
branches A branch, also called a ramus in botany, is a stem that grows off from another stem, or when structures like veins in leaves are divided into smaller veins. History and etymology In Old English, there are numerous words for branch, includi ...
, which causes the PC to jump to one of two locations depending on a test, like whether a given memory address holds a non-zero value. In the ILLIAC design, each PE would be applying this test to different values, and thus have different outcomes. Since those values are private to the PE, the following instructions would need to be loaded based on a value only the PE knew. To avoid the delays reloading the PE instructions would cause, the ILLIAC loaded the PEMs with the instructions on both sides of the branch. Logical tests did not change the PC, instead, they set "mode bits" that told the PE whether or not to run the next arithmetic instruction. To use this system, the program would be written so that one of the two possible instruction streams followed the test, and ended with an instruction to invert the bits. Code for the second branch would then follow, ending with an instruction to set all the bits to 1. If the test selected the "first" branch, that PE would continue on as normal. When it reached the end of that code, the mode operator instruction would flip the mode bits, and from then on that PE would ignore further instructions. This would continue until it reached the end of the code for the second branch, where the mode reset instruction would turn the PE back on. If a particular PE's test resulted in the second branch being taken, it would instead set the mode bits to ignore further instructions until it reached the end of the first branch, where the mode operator would flip the bits and cause the second branch to begin processing, once again turning them all on at the end of that branch. Since the PEs can operate in 64-, 32- and 8-bit modes, the mode flags had multiple bits so the individual words could be turned on or off. For instance, in the case when the PE was operating in 32-bit mode, one "side" of the PE might have the test come out true while the other side was false.


Terminology

* CU: control unit * CPU: central processing unit * ISA: instruction set architecture * MAC: multiply-and-accumulate * PC: program counter * PCB: printed circuit board * PE: processing element * PEM: processing element memory module * PU: processing unit


See also

*
Amdahl's law In computer architecture, Amdahl's law (or Amdahl's argument) is a formula that shows how much faster a task can be completed when more resources are added to the system. The law can be stated as: "the overall performance improvement gained by ...
, which suggests there are limits to the performance increase of parallel computers * ILLIAC III, a special-purpose SIMD machine built around the same time as ILLIAC IV * Parallel Element Processing Ensemble, another massively-parallel Burroughs machine, this one a
Bell Labs Nokia Bell Labs, commonly referred to as ''Bell Labs'', is an American industrial research and development company owned by Finnish technology company Nokia. With headquarters located in Murray Hill, New Jersey, Murray Hill, New Jersey, the compa ...
design *
Bull Gamma 60 The Bull Gamma 60 was a large transistorized mainframe computer designed by Compagnie des Machines Bull. Initially announced in 1957, the first unit shipped in 1960. It holds the distinction of being the world's first Multithreading (computer arc ...
, an early parallel computer released in 1960


Notes


References


Citations


Bibliography

* * * * * * * * * * *


Further reading


ILLIAC IV CFD



External links


ILLIAC IV documentation
at bitsavers.org
Oral history interview with Ivan Sutherland
Charles Babbage Institute The IT History Society (ITHS) is an organization that supports the history and scholarship of information technology by encouraging, fostering, and facilitating archival and historical research. Formerly known as the Charles Babbage Foundation, ...
, University of Minnesota.
Sutherland Sutherland () is a Counties of Scotland, historic county, registration county and lieutenancy areas of Scotland, lieutenancy area in the Scottish Highlands, Highlands of Scotland. The name dates from the Scandinavian Scotland, Viking era when t ...
describes his tenure from 1963 to 1965 as head of the
Information Processing Techniques Office The Information Processing Techniques Office (IPTO), originally "Command and Control Research",Lyon, Matthew; Hafner, Katie (1999-08-19). ''Where Wizards Stay Up Late: The Origins Of The Internet'' (p. 39). Simon & Schuster. Kindle Edition. was par ...
(IPTO) and new initiatives, such as ILLIAC IV. *  — panel discussion at
Computer History Museum The Computer History Museum (CHM) is a computer museum in Mountain View, California. The museum presents stories and artifacts of Silicon Valley and the Information Age, and explores the Digital Revolution, computing revolution and its impact ...
, June 24, 1997. {{Mainframes Massively parallel computers One-of-a-kind computers Parallel computing Supercomputers